Abstract. In the last decade, there has been a substantial amount of
research in finding routing algorithms designed specifically to run on
real-world graphs. In 2010, Abraham et al. showed upper bounds on the
query time in terms of a graph’s highway dimension and diameter for
the current fastest routing algorithms, including CONTRACTION
HIERARCHIES, TRANSIT NODE ROUTING, and HUB LABELING. In this paper, we
show corresponding lower bounds for the same three algorithms. We also
show how to improve a result by Milosavljevic which lower bounds the
number of shortcuts added in the preprocessing stage for CONTRACTION
HIERARCHIES. We relax the assumption of an optimal contraction order
(which is NP-hard to compute), allowing the result to be applicable to
real-world instances. Finally, we give a proof that optimal
preprocessing for HUB LABELING is NP-hard. Hardness of optimal
preprocessing is known for most routing algorithms, and was suspected
to be true for HUB LABELING.


1 Introduction 

The problem of finding shortest paths in road networks has been well-studied in 
the last decade, motivated by the application of computing driving directions. 
Although Dijkstra’s algorithm runs in small polynomial time, for applications 
involving continental-sized road networks, Dijkstra’s algorithm is simply not 
fast enough. There have been many different approaches to find algorithms that 
specifically run fast on real-world graphs. 

Most recent innovations involve a two-stage algorithm: a preprocessing stage 
and a query stage. The preprocessing stage runs once and can spend hours 
calculating data. Then the query stage uses this data to find shortest paths very 
fast, often several orders of magnitude faster than Dijkstra’s algorithm for a 
continental query. Once the preprocessing stage is completed, the users can run 
as many queries as they want. For a query between two nodes s and t (an s-t 
query), the algorithm returns dist(s,t), the cost of the shortest path between s 
and t. Most algorithms can also return the vertices on the shortest path using 
an extra data structure. 

The current fastest routing algorithm on real-world graphs is HUB LABELING
[2], which achieves a speedup of six orders of magnitude over Dijkstra’s
algorithm. The TRANSIT NODE ROUTING algorithm is second-fastest, and requires
an order of magnitude less space than HUB LABELING. CONTRACTION HIERARCHIES
is also a fast routing algorithm, which was state of the art in 2008. For a
comprehensive overview of the best routing algorithms, see [B].

Until recently, it was known that these algorithms performed very well on 
real-world maps, but there were no theoretical guarantees. In fact, it is not hard 
to construct specific graphs for which these algorithms perform no faster than 
Dijkstra’s algorithm. So, an interesting theoretical question is to find properties 
present in all real-life graphs that explain why these algorithms work so well. 

With this motivation in mind, Abraham et al. defined the notion of highway 
dimension [1], intuitively, the extent to which all shortest paths are hit by at 
least one of a small set of access nodes. Although it is too computationally 
intensive to calculate the exact highway dimension for a continental road map, 
there is evidence that the highway dimension h is at most polylogarithmic in 
the number of vertices. It is conjectured that real-world routing networks always 
have low highway dimension, based on experimental evidence [3]. Abraham et 
al. were able to prove strong upper bounds on the query times in terms of 
highway dimension and diameter D for four of the fastest routing algorithms:
HUB LABELING, CONTRACTION HIERARCHIES, TRANSIT NODE ROUTING, and REACH.


1.1 Our results 

In this paper, we are interested in finding lower bounds for the current
state-of-the-art routing algorithms. We show tight or near-tight bounds on the runtime
for HUB LABELING, CONTRACTION HIERARCHIES, and TRANSIT NODE ROUTING. 

Our lower bounds may facilitate proving better guarantees of these
algorithms, or provide intuition for new routing algorithms, if one can find differences
between the graphs we use and real world instances. For example, the graphs we 
use have low highway dimension, but they do not have small separators and are 
nonplanar, so perhaps there is a way to modify hub labeling to take this into 
account. 

We show a tight lower bound for HUB LABELING, the fastest routing algorithm
to date [5]. For CONTRACTION HIERARCHIES and TRANSIT NODE ROUTING, the
definition of highway dimension in the lower bound versus upper bound is slightly
different (because of a recent redefinition by Abraham et al.), so we cannot quite
say the bounds are tight.

We can also use our analysis to generalize a known result by Milosavljevic, 
which lower bounds the number of shortcut edges in the preprocessing stage 
of CONTRACTION HIERARCHIES [12]. This result assumes an optimal contraction
order, which is NP-hard to compute [7]. So for real-world instances, we rely on
using contraction orders based on heuristics. We show how to relax the assumption
about the contraction order, which means the result can be applied to real-world 
instances. 

We also contribute a hardness result for optimal preprocessing of HUB
LABELING. In 2010, Bauer et al. established hardness for optimal preprocessing for
a variety of the best routing algorithms, including CONTRACTION HIERARCHIES
and TRANSIT NODE ROUTING. In this paper, we show that in HUB LABELING
preprocessing, the problem of minimizing the maximum label size over all vertices
is NP-hard.

This paper will proceed as follows. Section 2 will provide preliminary
information, specifically about highway dimension, and also the graph construction
used in our main theorems. In Section 3, we show a lower bound on the query
time of the HUB LABELING algorithm, and prove that optimal preprocessing
is NP-hard. In Section 4, we establish a lower bound on the query time for
CONTRACTION HIERARCHIES, and generalize a lower bound on the number of
shortcut edges added in the preprocessing phase. Section 5 establishes a lower
bound on the query time of TRANSIT NODE ROUTING. We conclude and discuss
future directions in Section 6.


2 Preliminaries 

In this paper, we assume nonnegative integral edge lengths and unique shortest 
paths. We will also assume graphs are undirected in all sections except for the 
hardness result. These are standard assumptions to make when proving bounds
on routing algorithms; see, for example, [3] and [12].

$B_r(v)$ represents all nodes $u$ such that $\mathrm{dist}(u, v) \le r$. We say a set of nodes
covers a set of paths if each path has at least one of its vertices in the set of
nodes.
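
The two notions above can be illustrated with a short Python sketch. It is a minimal example under stated assumptions, not part of the original analysis: the names `ball` and `covers` are hypothetical, the graph is assumed to be given as a dictionary of dictionaries of nonnegative edge lengths, and the ball is taken to be closed (distance at most r).

```python
import heapq
from math import inf


def ball(adj, v, r):
    """Return the set of nodes within distance r of v.

    adj maps each node to a dict {neighbor: nonnegative edge length}.
    This is Dijkstra's algorithm, cut off once distances exceed r.
    """
    dist = {v: 0}
    queue = [(0, v)]
    while queue:
        d, x = heapq.heappop(queue)
        if d > dist[x]:
            continue  # stale queue entry
        for y, w in adj[x].items():
            nd = d + w
            if nd <= r and nd < dist.get(y, inf):
                dist[y] = nd
                heapq.heappush(queue, (nd, y))
    return set(dist)


def covers(nodes, paths):
    """True if every path (a sequence of vertices) contains a vertex of `nodes`."""
    nodes = set(nodes)
    return all(nodes.intersection(path) for path in paths)
```

For instance, on the path a-b-c with edge lengths 1 and 2, `ball(adj, "a", 1)` contains only a and b, and the single node b covers both edges viewed as paths.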


2.1 Highway Dimension 

Now we will formally define the notion of highway dimension. 

The highway dimension of a graph $G = (V, E)$ is the smallest $h$ such that for
all $r > 0$ and for all $B_{4r}(v)$, there exists a set $H \subseteq V$, such that $|H| \le h$ and $H$
covers all shortest paths of length $> r$ in $B_{4r}(v)$.

Highway dimension was specifically designed to explain why the best
routing algorithms perform well on real-world graphs but do not perform well on
arbitrary graphs. Although it is too computationally intensive to calculate the 
exact highway dimension of a continental-sized road network, it is conjectured 
that the highway dimension of real-world graphs is at most polylogarithmic in 
the number of vertices [3]. 

Abraham et al. introduced a slightly refined version of the original highway
dimension in 2013 [1].

The difference in the new definition versus the old one is that instead of
having to hit all local shortest paths of length $> r$, we have to hit all paths $P$
where there is a shortest path $P'$ with endpoints $s$ and $t$ such that $\ell(P') > r$,
$P \subseteq P'$, and $P' \setminus P \in \{\emptyset, \{s\}, \{t\}, \{s, t\}\}$. That is, we have to hit all paths that
can be obtained by removing zero, one, or both endpoints of a shortest path
with length $> r$. We will refer to a graph’s highway dimension as $h$ for the first
definition, and $h'$ for the second definition.

The two definitions of highway dimension are very similar but have a few key 
differences. Most notably, the new definition bounds the degree of the graph,
which was not true before [3]. The new definition of highway dimension allowed
Abraham et al. to improve their results on the runtime of routing algorithms. 

2.2 Definition of $G_{t,k,q}$

Now we will define the family of graphs $G_{t,k,q}$ that will be used in many of our
proofs. $G_{t,k,q}$ was designed by Milosavljevic to show a lower bound on the
number of shortcuts created during the preprocessing stage of CONTRACTION
HIERARCHIES [12].

Consider a complete $t$-ary tree of height $k$ for integers $t, k \ge 2$. Let $\lambda(v)$
denote the height of node $v$, and let $\lambda(u, v)$ denote the height of the lowest
common ancestor between two nodes $u$ and $v$.

Now define the edges as follows: for all nodes v and w such that w is a proper
ancestor of v, there is an edge between v and w with length This means 

the edge length from a node w to one of its descendants v is independent of $\lambda(v)$.
Furthermore, edge lengths increase for nodes higher up in the tree. 

Denote this graph by $G_{t,k} = (V_{t,k}, E_{t,k})$. See Figure 1 for an example. For
convenience, we will still refer to this graph as a tree, even though the additional
edges create cycles.

Now we will define $G_{t,k,q} = (V_{t,k,q}, E_{t,k,q})$ by taking $q$ copies of $G_{t,k}$, and
naming them $G^{(a)} = (V^{(a)}_{t,k}, E^{(a)}_{t,k})$ for $a = 1, 2, \ldots, q$. The copy of a node $v \in G_{t,k}$
in $G^{(a)}$ is denoted $v^{(a)}$.

For all $v \in G_{t,k}$ and $a \ne b$, we add edge $v^{(a)}$-$v^{(b)}$ to $E_{t,k,q}$ with length
$2^{\lambda(v)-k-1}$. This ensures that switching copies has a low penalty (it is
always less than 1), and it is always cheaper to switch among copies lower down
in the tree. See Figure 1 for an example.
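
To make the construction concrete, the following Python sketch builds the node and edge sets of $G_{t,k,q}$ for small parameters. It is an illustration only. The function name `build_g`, the encoding of a tree node as the tuple of child indices on its path from the root, and the caller-supplied `ancestor_len` (the length of an edge to a proper ancestor, as a function of that ancestor's height) are assumptions of the sketch. The copy-switching length $2^{\lambda(v)-k-1}$ is taken as read from the definition above.

```python
from itertools import product


def build_g(t, k, q, ancestor_len):
    """Return (nodes, edges) for the graph G_{t,k,q}.

    A tree node is the tuple of child indices on its root path (root = ()),
    so its height is k - len(path). A node in copy a is the pair (path, a).
    `ancestor_len(height)` is a hypothetical parameter giving the length of
    an edge from any descendant to an ancestor of the given height.
    """
    paths = [p for depth in range(k + 1)
             for p in product(range(t), repeat=depth)]
    nodes = [(p, a) for a in range(q) for p in paths]
    edges = {}
    # Within each copy: an edge from every node to each proper ancestor.
    for a in range(q):
        for p in paths:
            for cut in range(len(p)):
                w = p[:cut]  # proper ancestor of p
                edges[frozenset({(p, a), (w, a)})] = ancestor_len(k - cut)
    # Between copies: an edge joining every pair of copies of the same node.
    for p in paths:
        height = k - len(p)
        for a in range(q):
            for b in range(a + 1, q):
                edges[frozenset({(p, a), (p, b)})] = 2.0 ** (height - k - 1)
    return nodes, edges
```

With `t = 2`, `k = 3`, `q = 2`, each copy has 15 tree nodes, so the sketch yields 30 nodes, and every copy-switching edge has length at most 1/2.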




Fig. 1. The left graph is $G_{2,3}$, and the right graph is $G_{2,3,2}$.

2.3 Properties of $G_{t,k,q}$

We will now discuss properties of $G_{t,k,q}$. The following three lemmas are proven
in [12].

Lemma 1. Given $s, t \in V_{t,k}$ with lowest common ancestor $w$, the unique
shortest $s$-$t$ path is $s$-$w$-$t$.

Lemma 2. Given $s^{(a)}$ and $t^{(b)}$ in $G_{t,k,q}$, let $w$ be the lowest common ancestor
between $s$ and $t$. Then the shortest paths are:

$s^{(a)}$-$s^{(b)}$-$w^{(b)}$-$t^{(b)}$, if $\lambda(s) \le \lambda(t)$, and/or
$s^{(a)}$-$w^{(a)}$-$t^{(a)}$-$t^{(b)}$, if $\lambda(t) \le \lambda(s)$.

Lemma 3. The highway dimension $h$ of $G_{t,k,q}$ is equal to $q$, the diameter $D$ is
$\Theta(16^k)$, and $|V_{t,k,q}| = \Theta(q t^k)$.

It is worth noting that at the start we assumed graphs have unique
shortest paths, but now many shortest paths in our main family of graphs are not
unique. However, this is a common assumption in routing algorithm proofs
because it is not hard to perturb the input to make all shortest paths unique while
maintaining the validity of the proofs.

Additionally, integrality of edge lengths is violated. Since the smallest edge
is $2^{-k}$ (and all edge lengths are multiples of this), all of the edge weights can be
multiplied by $2^k$ to create integral lengths. This will increase $D$ by a factor of
$2^k$, at most doubling $\log D$, which will not affect our results.

3 Hub Labeling 

The HUB LABELING algorithm was first devised in 2004 by Gavoille et al. uni, 
and further studied by Cohen et al. [S] . However, the algorithm was not practical 
for continental routing queries until 2011, when Abraham et al. came up with 
an efficient way to perform the preprocessing and query phases, which made it 
the fastest routing algorithm to date [2]. 

In this section, we will first give an introduction to the HUB LABELING
algorithm. Then we will present a lower bound on the query time. Finally, we will
show the preprocessing phase is NP-hard to optimize. 

3.1 The algorithm 

HUB LABELING relies on the concept of labeling. Each node stores information 
about its shortest paths that allows us to reconstruct the shortest path during 
a query. This idea is used in a clever way to make queries run very fast. 

In the HUB LABELING algorithm, we give each node $v \in V$ a label consisting
of other nodes (the hubs of $v$), and we store the shortest distances to the hubs
from $v$. We define a labeling $L$ as the set of labels $L(v)$ for all $v \in V$.

We construct the labeling in such a way that for any pair of nodes $s$ and $t$,
$L(s) \cap L(t)$ contains at least one node on the shortest path from $s$ to $t$. When
satisfied, this is called the cover property. Then in order to perform an $s$-$t$ query,
we only need to find the $v \in L(s) \cap L(t)$ that minimizes $\mathrm{dist}(s, v) + \mathrm{dist}(v, t)$. This
can be made to take $O(|L(s)| + |L(t)|)$ time if the labels are sorted with some
arbitrary node order. This process returns $\mathrm{dist}(s, t)$. To return the nodes on this
shortest path, we need to add another data structure in the preprocessing stage,
which does not increase the space complexity by more than a constant factor

m- 
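
The query step described above can be sketched as a linear merge over two sorted labels. This is a minimal illustration rather than the implementation of [2]: the function name `hl_query` and the representation of a label as a list of `(hub, distance)` pairs sorted by hub identifier are assumptions made for the example.

```python
from math import inf


def hl_query(label_s, label_t):
    """Return dist(s, t) from two hub labels.

    Each label is a list of (hub, distance_to_hub) pairs sorted by hub.
    The two lists are scanned once each, so the running time is
    O(|L(s)| + |L(t)|). If the labels share no hub, inf is returned.
    """
    i = j = 0
    best = inf
    while i < len(label_s) and j < len(label_t):
        hub_s, dist_s = label_s[i]
        hub_t, dist_t = label_t[j]
        if hub_s == hub_t:
            # Common hub: candidate distance through this hub.
            best = min(best, dist_s + dist_t)
            i += 1
            j += 1
        elif hub_s < hub_t:
            i += 1
        else:
            j += 1
    return best
```

On the unit-length path 0-1-2 with labels L(0) = {0, 1}, L(1) = {1}, and L(2) = {1, 2}, the 0-2 query finds the common hub 1 and returns 2.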

In Section 3.3 we will show that it is NP-hard to find the labeling that
minimizes the maximum label size for all vertices. This was suspected to be true.
Therefore, in practice we must rely on heuristics in the preprocessing stage. 

Abraham et al. showed that the query time of HUB LABELING is $O(h' \log D)$
using a specific labeling [1], where $h'$ denotes the highway dimension under the
newer definition. The proof did not use any properties of $h'$ that are
different from $h$, so we can also say that the query time is $O(h \log D)$.

It is not known how to construct the labeling used in their proof in polynomial
time, so they showed a corollary that uses a polynomial preprocessing algorithm
and permits queries to be handled in $O(h \log h \log D)$ time.

3.2 Lower bounding the query time 

We cannot prove a lower bound on the minimum query time, since labelings can 
be constructed to make any one query run in constant time. Instead, we will 
prove a bound on the average query time by bounding the sum of all label sizes. 

Theorem 1. For all $h$, $D$, $n$, there is a graph $G = (V, E)$ with highway
dimension $h$, diameter $\Theta(D)$, and $|V| \ge n$, such that for any choice of labeling $L$, the
average query requires $\Omega(h \log D)$ time.

Proof. We will show that $G_{t,k,q}$ satisfies the desired requirements, with $t$, $k$, and
$q$ to be defined at the end of the proof.

Consider different classes of shortest paths between pairs of leaves
distinguished by the height of their lowest common ancestor as follows.

For $0 \le i \le k$, let $P_i = \{s\text{-}t \mid s$ and $t$ are leaves, and $\lambda(s, t) = i\}$.

Let $H = \sum_{v \in V} |L(v)|$. Our goal is to show that a constant fraction of the
$k + 1$ sets $P_0, P_1, \ldots, P_k$ each contribute $\Omega(q^2 t^k)$ distinct nodes to the sum $H$.

We make the assumption that all the neighbors of a leaf $v^{(a)}$, and the leaf
itself, are in that leaf’s label. That is, $L(v^{(a)})$ contains $v^{(b)}$ for all $b$ (even when
$b = a$), and contains $w^{(a)}$ for all ancestors $w$ of $v$. These are $k + q + 1$ nodes per
leaf and $t^k(k + q + 1)$ total nodes, which is asymptotically less than $\Omega(t^k q^2 k)$,
the desired result. Therefore, this assumption will not affect the validity of our
proof.

Now consider an arbitrary path $p$ in $P_i$. Label the endpoints of the shortest
path $p$ by $s^{(a)}$ and $t^{(b)}$. From Lemma 2, $p$ must equal $s^{(a)}$-$s^{(b)}$-$w^{(b)}$-$t^{(b)}$, where
$w$ is the lowest common ancestor of $s$ and $t$, and $\lambda(w) = i$.

$L(s^{(a)}) \cap L(t^{(b)})$ must contain at least one of $s^{(a)}$, $s^{(b)}$, $w^{(b)}$, $t^{(b)}$ in order
to satisfy the cover property. By our assumption above, $s^{(a)} \in L(s^{(a)})$ and
$t^{(b)} \in L(t^{(b)})$. Now there are four cases.

Case 1: $s^{(a)} \in L(t^{(b)})$. Note that $s^{(a)}$ is not on any other shortest path starting
at $t^{(b)}$.

Case 2: $t^{(b)} \in L(s^{(a)})$. Again, $t^{(b)}$ is not on any other shortest path starting
at $s^{(a)}$.

Case 3: $s^{(b)} \in L(t^{(b)})$. $s^{(b)}$ is on all leaf-leaf shortest paths (that end at
$t^{(b)}$) of the form $s^{(c)}$-$s^{(b)}$-$w^{(b)}$-$t^{(b)}$ for $c \ne b$. There are $q - 1$ such paths in $P_i$.

Case 4: $w^{(b)} \in L(s^{(a)})$. $w^{(b)}$ is on all leaf-leaf shortest paths (that start at
$s^{(a)}$) of the form $s^{(a)}$-$s^{(b)}$-$w^{(b)}$-$v^{(b)}$ for $v$ such that $\lambda(s, v) = i$. There are $t^i - t^{i-1}$
such paths, since there are $t^i$ leaves with $w^{(b)}$ as an ancestor, and all but $t^{i-1}$ of
those leaves have $w^{(b)}$ as the lowest height ancestor to get to $s^{(a)}$.

Furthermore,

$$|P_i| = \binom{q}{2}\, t^k (t^i - t^{i-1}) = \frac{q(q-1)\, t^k (t^i - t^{i-1})}{2} \qquad (1)$$

because there are $\binom{q}{2}$ ways to pick two copies of trees, $t^k$ choices for the first leaf,
and $t^i - t^{i-1}$ choices for the second leaf (in order to guarantee that the leaves
have a lowest common ancestor of height $i$).
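
The count of choices for the second leaf can be sanity-checked by brute force on a small tree. The sketch below is only an illustration: the function name `leaves_with_lca_height` is hypothetical, and a leaf of the complete t-ary tree of height k is encoded as the tuple of k child indices on its path from the root.

```python
from itertools import product


def leaves_with_lca_height(t, k, s, i):
    """Count leaves v whose lowest common ancestor with leaf s has height i.

    Leaves are tuples of k child indices. Two leaves that agree on their
    first c indices (and differ at the next one) have their lowest common
    ancestor at depth c, that is, at height k - c.
    """
    count = 0
    for v in product(range(t), repeat=k):
        common = 0
        while common < k and s[common] == v[common]:
            common += 1
        if k - common == i:
            count += 1
    return count
```

For every height i between 1 and k the function returns t^i - t^(i-1), matching the factor used in (1).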

So if we assume $t^i - t^{i-1} > q - 1$ (we will explain in the next paragraph why
we can make this assumption), then we can achieve a lower bound on the number
of labels needed for $P_i$ by exclusively using Case 4 for our choice of labels.

$$\frac{q(q-1)\, t^k (t^i - t^{i-1})}{2} \cdot \frac{1}{t^i - t^{i-1}} = \frac{q(q-1)\, t^k}{2} \qquad (2)$$

Therefore, the contribution of $P_i$ to the total sum $H$ is at least $\frac{q(q-1)\, t^k}{2}$. For
all $i$, the hubs that $P_i$ contributes to the sum $H$ have height $i$, ensuring that a
node does not get double counted in $H$.

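Let $k = \lceil \log_{16} D \rceil$, $q = h$, and pick $t$ big enough such that $q t^k \ge n$ (ensuring
that $|V| \ge n$) and $t^{k/2} - t^{k/2-1} \ge q$ (ensuring that at least half of the $P_i$’s satisfy
$t^i - t^{i-1} > q - 1$).

Then the highway dimension of $G$ is $h$, and the diameter is $\Theta(D)$. Recall that
$|V| \in \Theta(q t^k)$. Then for any given labeling $L$,

$$\sum_{v \in V} |L(v)| \ge \frac{k}{2} \cdot \frac{q(q-1)\, t^k}{2} \in \Omega(h |V| \log D). \qquad (3)$$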


This completes the proof since query times depend on the size of the labels. 

□ 


With this theorem, the upper bound presented in [1] becomes tight.


3.3 Hardness of preprocessing 

In 2010, Bauer et al. established hardness for optimal preprocessing for a
variety of the best routing algorithms, including CONTRACTION HIERARCHIES and
TRANSIT NODE ROUTING [7]. We provide hardness for optimal preprocessing in
HUB LABELING, which was suspected to be true [3]. By optimal preprocessing, we
mean minimizing the maximum label size over all vertices. Babenko et al. very
recently established hardness for nearly the same problem, but they defined
optimal preprocessing as minimizing the total label size [S]. Our definition of
optimal corresponds to minimizing the maximum query time, whereas the other
definition corresponds to minimizing the average query time.

We will switch to directed graphs, which was the original setting of HUB
LABELING [2]. The main difference is that each node $v$ has a forward label $L_f(v)$
and a reverse label $L_r(v)$, and the cover property states that for a directed $s$-$t$
query, $L_f(s) \cap L_r(t)$ contains at least one node on the shortest $s$-$t$ path.

Now we formally define the problem minimum hub labeling (MHL) as
follows:

Problem (MHL). Given a directed graph $G = (V, A)$ and an integer $k$, find a
labeling $L$ satisfying the cover property such that
$\max_{v \in V} \max(|L_f(v)|, |L_r(v)|) \le k$.
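
Checking whether a given labeling is a feasible MHL solution is straightforward, which the following Python sketch illustrates for unit-length arcs. The names `bfs_distances`, `satisfies_cover`, and `max_label_size` are hypothetical, labels are assumed to be Python sets stored in two dictionaries, and pairs with s = t are skipped; none of this is taken from the original paper.

```python
from collections import deque


def bfs_distances(adj, s):
    """Unit-length shortest distances from s in a directed graph."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def satisfies_cover(adj, Lf, Lr):
    """True if, for every reachable pair s != t, some hub in
    Lf[s] & Lr[t] lies on a shortest s-t path."""
    dist = {s: bfs_distances(adj, s) for s in adj}
    for s in adj:
        for t, d in dist[s].items():
            if t == s:
                continue
            on_shortest_path = any(
                h in dist[s] and t in dist[h]
                and dist[s][h] + dist[h][t] == d
                for h in Lf[s] & Lr[t]
            )
            if not on_shortest_path:
                return False
    return True


def max_label_size(Lf, Lr):
    """The quantity that an MHL solution must keep at most k."""
    return max(max(len(Lf[v]), len(Lr[v])) for v in Lf)
```

A labeling is then a yes-certificate for (G, k) exactly when `satisfies_cover` holds and `max_label_size` is at most k.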


We will show a reduction from a classical NP-hard problem, exact cover by
3-sets (X3C). In an X3C instance $(U, C)$, $U$ is a set of elements, 3 divides $|U|$,
and $C$ is a set of triples of $U$. The problem is whether there exists a set $C' \subseteq C$,
$|C'| = |U|/3$, such that $C'$ covers $U$ (an exact 3-covering of $U$).

Here is an outline of the proof. Given an X3C instance $(U, C)$, we create an
MHL instance $(G, k)$ where $G = (V, E)$, $U \cup C \subseteq V$ and for $c \in C$, $u \in U$,
$c$-$u \in E \iff u \in c$.

We also add a clique of vertices $\{b_1, \ldots, b_{2k-1}\} = B$ with arcs to nodes in $U$,
whose sole purpose is to fill up the reverse labels of nodes in $U$. Finally, we add
two vertices $\{a_1, a_2\} = A$ with arcs to every node in $C$.

By filling up the reverse labels of nodes $u \in U$, we force the nodes $a \in A$ to
use nodes in $C$ or $U$ for the hubs of $a$-$u$ shortest paths. And it is too inefficient
to use nodes in $U$ for the hubs, so nodes in $C$ must act as the hubs. Then in
order for $A$’s label size to stay $\le k$, there must be an exact cover for $U$.

Theorem 2. Minimum hub-labeling is NP-hard. 

First we construct a graph G, and then prove lemmas about its labeling until 
we work up to proving the theorem. 

Given an X3C instance $(U, C)$, we create an MHL instance $(G, k)$ where
$G = (V, E)$, $V = A \cup C \cup U \cup B$, $|A| = 2$, and $|B| = \frac{2}{3}|U| + 1$. For all $a \in A$ and
$c \in C$, there is a directed edge $(a, c) \in E$. For all $u \in U$ and $c \in C$ such that
$u \in c$, there is a directed edge $(c, u) \in E$. For all $b_1, b_2 \in B$ such that $b_1 \ne b_2$,
$(b_1, b_2)$ and $(b_2, b_1)$ are in $E$. Let $B'$ be a subset of $B$ such that $|B'| = \frac{|U|}{3} - 1$
(it does not matter which $b$’s are in $B'$). For all $b' \in B'$ and all $u \in U$, there is a
directed edge $(b', u) \in E$. All edges are unit length. Finally, set $k = \frac{|U|}{3} + 1$. See
Figure 2.



Fig. 2. The MHL instance constructed from X3C. 
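
The construction can be written out as a short Python sketch. It is an illustration under stated assumptions: the function name `x3c_to_mhl` and the tagging of vertices as pairs such as `("a", 0)` or `("u", x)` are hypothetical, and the sizes $|B| = \frac{2}{3}|U| + 1$, $|B'| = \frac{|U|}{3} - 1$, and $k = \frac{|U|}{3} + 1$ are taken as read from the description of the reduction.

```python
def x3c_to_mhl(U, C):
    """Build the MHL instance (adjacency sets, k) from an X3C instance.

    U is a list of elements with len(U) divisible by 3, and C is a list
    of triples over U. Arcs are stored as adj[x] = set of out-neighbors;
    all arcs have unit length.
    """
    n = len(U)
    assert n % 3 == 0
    k = n // 3 + 1
    A = [("a", i) for i in range(2)]
    B = [("b", i) for i in range(2 * n // 3 + 1)]
    B_prime = B[: n // 3 - 1]  # which members of B are chosen is irrelevant
    C_nodes = [("c", i) for i in range(len(C))]
    U_nodes = {u: ("u", u) for u in U}

    adj = {v: set() for v in A + B + C_nodes + list(U_nodes.values())}
    for a in A:  # arcs (a, c) for every a in A and c in C
        adj[a].update(C_nodes)
    for i, triple in enumerate(C):  # arcs (c, u) whenever u is in c
        for u in triple:
            adj[("c", i)].add(U_nodes[u])
    for b1 in B:  # B is a bidirected clique
        for b2 in B:
            if b1 != b2:
                adj[b1].add(b2)
    for b in B_prime:  # arcs (b', u) for every b' in B' and u in U
        adj[b].update(U_nodes.values())
    return adj, k
```

For |U| = 6 and three triples, this gives k = 3, a clique B on 2k - 1 = 5 vertices, and a single vertex in B'.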


First we prove the forward direction: if $(G, k)$ is a yes instance, then $(U, C)$
is a yes instance. We prove this using a few different lemmas.

Lemma 4. If $(G, k)$ is a yes instance, then for all $b \in B$, $L_f(b)$ and $L_r(b)$
contain $k$ vertices from $B$.

Proof. Given $b_1, b_2 \in B$, the shortest $b_1$-$b_2$ path is the edge $b_1$-$b_2$, since $B$ is
fully connected. Then to satisfy the cover property, either $b_1 \in L_f(b_1) \cap L_r(b_2)$ or
$b_2 \in L_f(b_1) \cap L_r(b_2)$. Each of these cases puts one vertex in a label that cannot
be reused for any other shortest path ($b_1 \in L_r(b_2)$ or $b_2 \in L_f(b_1)$).

First we note that for all $b \in B$, $b \in L_f(b)$ and $b \in L_r(b)$. If this were not the
case (WLOG $b \notin L_f(b)$), then $L_f(b)$ must contain $|B| - 1 = \frac{2}{3}|U| > k$ vertices
to satisfy all its outgoing shortest paths, which contradicts our assumption.

Now we note there are $|B|(|B| - 1)$ total shortest paths, and each requires
adding exactly one node to a label that cannot be reused for any other shortest
path. Then the minimum max label we can hope to achieve is $\frac{|U|}{3} + 1$, which
corresponds to splitting the $|B|(|B| - 1)$ vertices equally among $B$. So each
forward and reverse label has size $\frac{|B|(|B| - 1)}{2|B|} = \frac{|U|}{3}$, plus the self hub to
reach a total of $\frac{|U|}{3} + 1 = k$. □

Corollary 1. If $(G, k)$ is a yes instance, then for all $u \in U$, $B' \subseteq L_r(u)$.

Proof. Given $u \in U$, $b' \in B'$, the shortest $b'$-$u$ path is the edge $b'$-$u$. Then to
satisfy the cover property, either $u \in L_f(b') \cap L_r(u)$, or $b' \in L_f(b') \cap L_r(u)$. From
Lemma 4, we know that $L_f(b')$ already contains $k$ vertices from $B$. Therefore, it
must be the case that $b' \in L_r(u)$. Then for all $u \in U$, $B' \subseteq L_r(u)$. □

So, now we know that the reverse labels for nodes in $U$ are almost full. To
finish off the forward direction, we need to show that the only way for vertices
in $A$ to have at most $k$ hubs is to use an exact cover $C'$ for $U$. Intuitively, it makes
sense that the $a$-$c$-$u$ shortest paths should use the vertices in $C$ as hubs rather
than vertices in $A$ or $U$, because each can be used for three shortest paths instead
of just one. However, we need to make certain that some hybrid label with $A$’s,
$C$’s, and $U$’s does not work.

Define $A_1 = |\{u \in U \mid u \in L_f(a_1)\}|$ and $A_2 = |\{u \in U \mid u \in L_f(a_2)\}|$.
Also let $U_1 = |\{u \in U \mid a_1 \in L_r(u)\}|$, and $U_2 = |\{u \in U \mid a_2 \in L_r(u)\}|$.

Lemma 5. If $(G, k)$ is a yes instance, then $A_1 = A_2 = U_1 = U_2 = 0$.

Proof. From Corollary 1, we know that for all $u \in U$, $L_r(u)$ contains $|B'| = k - 2$
vertices from $B'$. $L_r(u)$ will also need at least one total vertex for all the $c$-$u$
shortest paths, for $c$ such that $u \in c$. Therefore, we cannot put both $a_1$ and $a_2$
into $L_r(u)$.

Then, for every $u \in U$ such that $a_1 \in L_r(u)$, $u$ must be in $L_f(a_2)$, or else
there would be no other way for the $a_2$-$u$ path to satisfy the cover property.
Therefore, $U_1 \le A_2$. Similarly, $U_2 \le A_1$.

Now consider $a_1$’s forward label. $L_f(a_1)$ will need at least one total vertex for
all the $a$-$c$ shortest paths. Since there are $A_1$ vertices $u \in U$ such that $u \in L_f(a_1)$,
there is room in $L_f(a_1)$ for $k - 1 - A_1$ vertices, and we need to satisfy the cover
property for $|U| - U_1$ more shortest paths of the form $a_1$-$u$ such that $u \in U$. The
most efficient label for these shortest paths is to pick a vertex in $C$, which will
cover three at a time. Then we must have $|U| - U_1 \le 3(k - 1 - A_1) = |U| - 3A_1$,
from which it follows that $3A_1 \le U_1 \le A_2$. With the exact same argument, we
get $3A_2 \le A_1$. Then $9A_1 \le A_1$ and so $A_1 = 0$. It follows that $A_2 = 0$ as well,
and hence $U_1 \le A_2 = 0$ and $U_2 \le A_1 = 0$. □

Corollary 2. If $(G, k)$ is a yes instance, then $(U, C)$ is a yes instance.

Proof. From Lemma 5, it follows that for all $a \in A$ and $u \in U$, there exists a
$c \in C$, such that $u \in c$ and $c \in L_f(a) \cap L_r(u)$. Then there must be at least
$\frac{|U|}{3}$ vertices from $C$ in $L_f(a)$. $L_f(a)$ also needs a hub for all $a$-$c$ shortest paths
where $c \in C$. The only way to accomplish that is to let $a$ be the hub. Then $L_f(a)$
contains $a$, plus some $C' \subseteq C$ such that for all $u \in U$, there exists a $c \in C'$ such
that $u \in c$. Since $(G, k)$ is a yes instance, $|C'| \le k - 1 = \frac{|U|}{3}$. But then $C'$ is an
exact cover for $U$, so $(U, C)$ is a yes instance. □

Now we will show the backward direction. Proving the forward direction 
alludes to a specific labeling, so now it is just a matter of showing this labeling 
is actually possible. 

Lemma 6. If $(U, C)$ is a yes instance, then $(G, k)$ is a yes instance.

Proof. Let $C' \subseteq C$ be an exact cover for $U$. Given $u \in U$, denote $c_u$ as the
element in $C'$ that covers $u$. We present the following labeling $L$.

For $a \in A$, $L_f(a) = \{a\} \cup C'$, and $L_r(a) = \emptyset$.

For $c \in C$, $L_f(c) = \{u \mid u \in c\}$ and $L_r(c) = A$.

For $u \in U$, $L_f(u) = \emptyset$ and $L_r(u) = \{u, c_u\} \cup B'$.

For $b_i \in B$, $L_f(b_i) = \{b_i, b_{i+1}, \ldots, b_{i+k \bmod 2k}\}$ and
$L_r(b_i) = \{b_i, b_{i-1}, \ldots, b_{i-k \bmod 2k}\}$.

It is easy to check this labeling satisfies the cover property. Each $a$-$c$ shortest
path uses $a$ as a hub. Each $a$-$u$ shortest path uses $c_u$ as a hub. Each $c$-$u$ shortest
path uses $u$ as a hub. Each $b'$-$u$ shortest path uses $b'$ as a hub. Given $b_i, b_j \in B$,
if $j < i + k \bmod 2k$, then the $b_i$-$b_j$ shortest path uses $b_j$ as a hub. Otherwise,
it uses $b_i$ as a hub.

Also, it is clear that every label has size $\le k$. This completes the proof. □

The proof of Theorem 2 follows directly from Corollary 2 and Lemma 6.

4 Contraction Hierarchies 

CONTRACTION HIERARCHIES m is a shortcut-based algorithm, making it
fundamentally different from HUB LABELING. It works by running bidirectional
Dijkstra search, pruning the searches based on a node’s importance.

In this section, we explain how the CONTRACTION HIERARCHIES algorithm
works, prove a lower bound on the query time, and then generalize a result about
the number of shortcut edges added in the preprocessing phase.

4.1 The algorithm 

In the preprocessing stage for CONTRACTION HIERARCHIES, we iteratively
contract nodes using a predefined ordering, called a contraction ordering. The
contraction operation called on $v$ first deletes $v$ from the graph, and then may
add edges between $v$’s neighbors if they are needed to preserve the shortest path
lengths. Any such edge is put into a set $E^+$. We contract every node in the graph
based on the ordering, and we are left with the set $E^+$ of “shortcut edges”.

To run an $s$-$t$ query, run bidirectional Dijkstra search from $s$ and $t$ on the
graph $G^+ = (V, E \cup E^+)$, except at node $v$, only consider edges $v$-$w$ in which
$w$ was contracted after $v$. When there are no more nodes to consider in either
direction, find the node $v$ that minimizes the sum of its distances to $s$ and to $t$.

In m, it is proven that v is guaranteed to be on the shortest path between 
s and t, which means that dist(s, t) = dist(s, v) + dist(v, t), so the query returns
the length of the shortest s-t path.
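
As a rough illustration of the two stages described above, the following Python sketch contracts the nodes of an undirected graph in a given order and answers queries with two upward searches. It is not the implementation from the literature. To stay short, it adds a shortcut between every pair of remaining neighbors without a witness search (so it may add more shortcuts than necessary, although distances are still preserved), and it runs both upward searches to completion instead of interleaving them. The names `contract_all`, `upward_search`, and `ch_query` are hypothetical.

```python
import heapq
from math import inf


def contract_all(adj, order):
    """Contract nodes in `order`; return the upward graph.

    adj maps each node to {neighbor: length} (undirected). up[v] holds
    the edges (original or shortcut) from v to nodes contracted after v.
    """
    g = {u: dict(nbrs) for u, nbrs in adj.items()}  # working overlay graph
    up = {u: {} for u in adj}
    for v in order:
        items = list(g[v].items())  # remaining neighbors, all contracted later
        up[v] = dict(items)
        for idx, (u, wu) in enumerate(items):
            del g[u][v]
            for x, wx in items[idx + 1:]:
                d = wu + wx  # length of the path u-v-x
                if d < g[u].get(x, inf):  # no witness search: a simplification
                    g[u][x] = d
                    g[x][u] = d
        del g[v]
    return up


def upward_search(up, s):
    """Dijkstra from s that only follows edges to later-contracted nodes."""
    dist = {s: 0}
    queue = [(0, s)]
    while queue:
        d, v = heapq.heappop(queue)
        if d > dist[v]:
            continue  # stale queue entry
        for u, w in up[v].items():
            if d + w < dist.get(u, inf):
                dist[u] = d + w
                heapq.heappush(queue, (d + w, u))
    return dist


def ch_query(up, s, t):
    """Return dist(s, t) as the best meeting node of the two upward searches."""
    ds, dt = upward_search(up, s), upward_search(up, t)
    return min((ds[v] + dt[v] for v in ds.keys() & dt.keys()), default=inf)
```

Any contraction order yields correct distances here. The order only changes how many shortcuts are created and how many nodes the upward searches visit.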

Note that any contraction ordering will give correct queries, but a better
contraction ordering will make $|E^+|$ small, decreasing time and space requirements.
Finding the optimal ordering is NP-hard [7], but there are fast heuristics that
make $|E^+|$ within $\log h$ of optimal [1].

Abraham et al. showed an upper bound on the query time of CONTRACTION
HIERARCHIES that depends on the maximum degree $\Delta$: $O((\Delta + h \log D)(h \log D))$ [3]. Using the
new definition of highway dimension, Abraham et al. achieved the better bound
of $O((h \log D)^2)$ time. Both of these assume optimal preprocessing. If a
polynomial time preprocessing algorithm is required, the bounds are modified to
$O((h \log h \log D)^2)$ and $O((\Delta + h \log h \log D)(h \log h \log D))$.

4.2 Lower bounding the query time

We show a lower bound using the old definition of highway dimension. 

Theorem 3. For all $h$, $D$, $n$, there is a graph $G = (V, E)$ with highway
dimension $h$, diameter $\Theta(D)$, and $|V| \ge n$ such that the average query time is
$\Omega((h \log D)^2)$ for CONTRACTION HIERARCHIES.

Our strategy will be to find a lower bound assuming Abraham et al.’s
(optimal) ordering, and then show that modifying the ordering can only increase the
runtime.

[12] provided a criterion for shortcut paths in the optimal ordering: the path
$s^{(a)}$-$s^{(b)}$-$w^{(b)}$ is shortcut if and only if $a \ne b$, $w$ is a proper ancestor of $s$,
and $s^{(b)}$ is contracted before $s^{(a)}$. First we present a proof sketch, and then we
give the formal proof.

Here is an outline of the proof. Again we will use $G_{t,k,q}$, and we limit our
analysis to leaf-leaf queries, which make up the majority of all queries. First we
prove the theorem assuming Abraham et al.’s contraction order. For $G_{t,k,q}$, this
means nodes are contracted based on their height in the tree. In the forward
search of a leaf-leaf query $s^{(a)}$-$t^{(b)}$, the only nodes we may visit are ancestors
$v^{(b)}$ of $s$ such that $v^{(b)}$ is contracted after $v^{(a)}$. Then half of these nodes will
have lower contraction order than the other half, and so it can be shown that
the shortcut criterion guarantees $\Omega(q^2 k^2)$ edges will be created along half of the
forward searches.

Then we show that veering away from this ordering will only increase the 
number of shortcut edges produced (or slightly decrease, but not by more than 
a constant factor). This is more technical. The main idea is to carefully examine 
the effects of contracting a node higher up in the tree, before all of its
descendants were contracted. Although contracting a higher node v decreases some of
the paths from any descendant u to v, it creates shortcuts between all pairs of
descendants which have not yet been contracted, which could cause an
exponential number of extra edges to be created. The overall difference does not increase
the big-Omega bound from Abraham et al.’s contraction order. 

Proof. We will show $G_{t,k,q}$ satisfies the properties, defining $t$, $k$, $q$ at the end of
the proof. Consider a query between two leaves $s^{(a)}$ and $t^{(b)}$ such that $\lambda(s, t) = k$
and $a \ne b$. This type of query makes up a constant fraction of all queries, so
we will limit our analysis to this case. A regular Dijkstra search settles $s^{(a)}$
and all copies of $s$, and then it settles the parent of $s^{(a)}$ and all its copies, and
continues to settle the successive ancestors of $s^{(a)}$ along with their copies. A total
of $q(k + 1)$ nodes are settled in this way. The backwards search goes through a
similar process starting at $t^{(b)}$. For CONTRACTION HIERARCHIES, each node only
needs to look at neighbors with a higher contraction order than itself. If we are
using an adjacency list to represent the graph, this can be done by reordering
the adjacency list based on contraction order.

Assume initially that we are using Abraham et al.’s contraction ordering,
which orders nodes by height from the bottom up (we will remove this
assumption shortly). So in the forward search, the only nodes we may visit are ancestors
of $s$ (in any copy). We refer to this set of nodes as $S$ and recall that it contains
$q$ nodes at each layer $i$ in the tree. Among the nodes in $S$ with height $i$, let $T_i$
contain the $\frac{q}{2}$ nodes with lower contraction order than the other $\frac{q}{2}$ nodes in that
layer. Let

$$T = \bigcup_{i \le k/2} T_i. \qquad (4)$$

Suppose $v^{(a)}$ is one of the $\frac{qk}{4}$ nodes in $T$. Recall that the shortcut criterion
for Abraham et al.’s ordering says the path $v^{(b)}$-$v^{(a)}$-$u^{(a)}$ (where $u$ is an ancestor
of $v$) will be shortcut if $v^{(a)}$ is contracted before $v^{(b)}$. Then contracting $v^{(a)}$ will
create at least $\frac{qk}{4}$ shortcuts, since $v^{(a)}$ is in the bottom half of the tree and has a
lower contraction order than half of the nodes $v$ in other copies. Therefore, the
forward search will need to look through at least $\left(\frac{qk}{4}\right)^2$ nodes, making the average
query take $\Omega(q^2 k^2)$ time.

Now we will consider a general ordering by examining the effects of contract¬ 
ing an arbitrary node on edges in g|“2 - 

If tiG) is contracted before a descendant shortcuts from u in any copy 
to vG) will never be created. The number of queries this affects is based on 
the height of u. If m is a leaf, it only affects queries starting from u, but if u 
is higher up in the tree, it will affect all queries starting at leaves with u as an 
ancestor. In effect, we need to weight the nodes based on their importance. We 
do this using where is a descendant of with contraction 

order higher than The value of this sum is proportional to the loss in total 
query time when contracting uG) compared to Abraham et al.’s ordering. Let 
A(r') = i. If all of wG)’s descendants were contracted before the sum would 
be = if because each layer can contribute at most f to the sum. 

There are two cases to consider. 

Case 1: ^ this case, the average query time decreases by 

at most a factor of two, which doesn’t affect our big-Omega bound. 

Case 2: The number of edges in that are lost from 

tiG)’s contraction is < t*, the number of wG)’s descendants. However, contracting 
tiG) before many of its descendants will create many leaf-leaf shortcuts. 

The smallest possible set of contracted descendants would contain the > 
nodes in the top | layers below wG). 

Given two of these nodes xG) and ?/G) -vvith \{x,y) = A(x), a shortcut will 
be created between xG) and ?/G)_ Half of the subtrees rooted at xG)’s children 
will have half of their nodes with contraction order higher than vG), so we will 
gain at least (* 2 ^)(^)^ = -—€ Q{f) extra shortcuts this way. 

Therefore, the number of edges decreases by at most a constant factor, which 
does not affect our big-Omega bound. 

In both cases, we maintain the $\Omega(q^2 k^2)$ bound even with an arbitrary
ordering.

Now let k = [Gsdf] and q = h, and we pick t big enough such that qt'^ > n. 
Then the average query time for CONTRACTION HIERARCHIES is $\Omega((h \log D)^2)$. □





4.3 Lower bounding the size of $E^+$

Abraham et al.’s upper bound of $O((h \log D)^2)$ on the query time involves
proving that $|E^+| \in O(nh \log D)$. The latter bound was proven tight by
Milosavljevic. However, that proof assumes the contraction order from the
algorithm in Abraham et al., which is thought to be NP-hard to compute. We show
a new proof of this lower bound generalized to any contraction order.

Theorem 4. For all $h$, $D$, $n$, there is a graph $G = (V, E)$ with highway
dimension $h$, diameter $O(D)$, and $|V| > n$ such that for any contraction ordering,
$|E^+| \in \Omega(h|V| \log D)$.

Proof. We will show that $G_{t,k,q}$ satisfies the desired requirements, setting the
values of $t$, $k$, $q$ at the end of the proof.

We will be concerned only with shortcuts added when contracting leaves. We 
will first count the number of shortcuts added by contracting all of the leaves 
first, as in the preprocessing algorithm by Abraham et al. Recall the criterion 
for creating a shortcut in this ordering, which was stated in Section 4.2. A path
g(a)_g( 6 )_y;(&)_^(t>) is shortcut if and only if a 7 ^ 6 , ru is a proper ancestor of s, 
and s^^^ is contracted before Then the number of shortcuts added when 
contracting all of the leaves is $S' = t^k \cdot k \cdot \binom{q}{2} \in O(qk|V|)$, since there
are $t^k$ ways of picking a leaf, $k$ ways of picking a proper ancestor, and
$\binom{q}{2}$ ways of picking two copies.
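
As a sanity check on this counting argument, the product can be compared with a direct enumeration. The sketch below assumes, as the counts in the text suggest, that each copy is a complete $t$-ary tree of height $k$ (so $t^k$ leaves, each with $k$ proper ancestors) and that there are $q$ copies; the function names are illustrative only.

```python
from itertools import combinations
from math import comb

def count_leaf_shortcuts(t, k, q):
    """Enumerate (leaf, proper ancestor, pair of copies) triples one by one."""
    count = 0
    for _leaf in range(t ** k):              # t^k leaves per copy (assumed)
        for _height in range(1, k + 1):      # one proper ancestor per height 1..k
            for _pair in combinations(range(q), 2):  # unordered pair of copies
                count += 1
    return count

def closed_form(t, k, q):
    """The product t^k * k * C(q, 2) from the text."""
    return t ** k * k * comb(q, 2)
```

For small parameters the two functions agree, which is all the enumeration is meant to confirm.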

In general, the number of shortcuts created for leaf at the time of its 
contraction is the number of ancestors has in 0^°“^ multiplied by the number 
of copies b ^ a, in other trees. We will now consider the effects of arbitrary 
contraction order on the number of edges a leaf has in its own copy at its time 
of contraction. 

Given an arbitrary contraction order $\theta$ and a non-leaf $v$, let $c_i$,
$1 \le i \le t$, be the number of leaves with contraction order higher than $v$ in the
subtree with $v$’s $i$th child as a root. Then $0 \le c_i \le t^{\lambda(v)-1}$ for all $i$.

Contracting $v$ causes $\sum_{i=1}^{t} c_i$ leaf descendants of $v$ to lose one edge each.
However, contracting $v$ also increases the number of leaf-leaf edges by
$\sum_{i \neq j} c_i c_j$ (each unordered pair $\{i, j\}$ counted once).

Then the net edge gain for contracting $v$ instead of all leaves first is

$$\Delta_{v,\theta} = \sum_{i \neq j} c_i c_j - \sum_{i=1}^{t} c_i. \qquad (5)$$

In order to find the minimum value of $\Delta_{v,\theta}$, we consider four cases.

Case 1: At least three $c_i$’s are nonzero. Without loss of generality, let the
$c_i$’s make a nonincreasing sequence. So $c_1 \ge c_2 \ge \cdots \ge c_t \ge 0$ and $c_3 \ge 1$.
Then $c_1 c_3 \ge c_1$, $c_1 c_2 \ge c_2$, $c_2 c_3 \ge c_3$, ..., $c_{t-1} c_t \ge c_t$. It follows that

$$\Delta_{v,\theta} = \sum_{i \neq j} c_i c_j - \sum_{i=1}^{t} c_i \ge 0. \qquad (6)$$

Case 2: Exactly two $c_i$’s are nonzero. So $c_1 \ge c_2 \ge 1$ and
$c_3 = c_4 = \cdots = c_t = 0$. Then
$\Delta_{v,\theta} = c_1 c_2 - c_1 - c_2 = (c_1 - 1)(c_2 - 1) - 1$. If $c_2 > 1$, then
$(c_1 - 1) \ge (c_2 - 1) \ge 1$, so $\Delta_{v,\theta} \ge 0$. If $c_2 = 1$, then $c_2 - 1 = 0$, so
$\Delta_{v,\theta} = -1$.

Case 3: Exactly one $c_i$ is nonzero. So $c_1 \ge 1$ and $c_2 = c_3 = \cdots = c_t = 0$.
Then $\Delta_{v,\theta} = -c_1$, so the minimum value of $\Delta_{v,\theta}$ in this case is
$-t^{\lambda(v)-1}$.

Case 4: All $c_i$’s are zero. Then clearly $\Delta_{v,\theta} = 0$.

Therefore, the minimum value of $\Delta_{v,\theta}$ is from case 3.
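
The case analysis can be checked by brute force for small parameters. The sketch below is not part of the proof: it assumes each $c_i$ ranges over the integers $0, \dots, m$ for a hypothetical cap `m` (standing in for the number of leaves below one child of $v$), and it counts each pair $\{i, j\}$ once, as in the computation of Case 2.

```python
from itertools import combinations, product

def net_edge_gain(c):
    """Leaf-leaf edges gained (one term per unordered pair of children)
    minus the edges lost by leaf descendants, as in equation (5)."""
    pair_gain = sum(ci * cj for ci, cj in combinations(c, 2))
    return pair_gain - sum(c)

def minimum_gain(t, m):
    """Exhaustively minimize the net gain over all vectors in {0..m}^t."""
    best = min(product(range(m + 1), repeat=t), key=net_edge_gain)
    return net_edge_gain(best), best
```

For every small `t` and `m` tried, the minimum equals `-m` and is attained by a vector with exactly one nonzero entry, in agreement with case 3.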

Note that the possible leaf-leaf edges we gain from contracting a non-leaf $v$
are independent of other leaf-leaf edges we gain from contracting another
non-leaf $u$: if $\lambda(v) = \lambda(u)$, the leaves in the edges must be different since they
cannot have both $v$ and $u$ as an ancestor. If $\lambda(v) \neq \lambda(u)$, the edges must be
different since the lowest common ancestors between the endpoints of each edge
are at different heights.

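So given an arbitrary contraction order, the number of leaf-edges within a
copy (at the time of the leaf’s contraction) is

$$k t^k + \sum_{v \text{ non-leaf}} \Delta_{v,\theta} \;\ge\; k t^k - \sum_{i=1}^{k} t^{k-i}\, t^{i-1} \;=\; k t^k - k t^{k-1} \;=\; k t^{k-1}(t-1). \qquad (7)$$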

Then

$$|E^+| \ge \binom{q}{2}\, k t^{k-1}(t-1) \in \Theta(kq|V|). \qquad (8)$$

We let k = 

and q 

= 

h, and we pick t such that qt^~^^ 

> n. Then 


$G$ has highway dimension $h$, diameter $O(D)$, and has $|V| > n$. Finally, given a
contraction order $\theta$, $|E^+| \in \Omega(h|V| \log D)$. □

5 Transit Node Routing 

TRANSIT NODE ROUTING [1] was devised in 2007 by Bast et al., and it (and its
variants) remains the second-fastest family of routing algorithms, behind HUB
LABELING [6]. However, TRANSIT NODE ROUTING requires about an order of
magnitude less space than HUB LABELING. In this section, we first review the
TRANSIT NODE ROUTING algorithm, and then we give a lower bound on the
query time.

The algorithm works by picking a set $T \subseteq V$ of transit nodes that hits many
long-distance shortest paths. $|T|$ is often chosen to be in $O(\sqrt{|E|})$, which makes
the algorithm run fastest while keeping the additional memory requirements
bounded by the input graph size. Usually, the contraction order is
used to pick $T$ (since contraction order essentially seeks to measure a node’s
importance with respect to shortest paths), which works well in practice.

Next, given any node $v$, $A(v) \subseteq T$ is the set of that node’s access nodes,
which are chosen to hit the long-distance queries stemming from $v$. This usually
means that we want to pick nodes in $T$ that are close to $v$.
The distances between all pairs of transit nodes are computed and stored,
as well as the distances between a node $v$ and each of its access nodes. A query
is called a global query if
$\min\{\mathrm{dist}(s, u) + \mathrm{dist}(u, v) + \mathrm{dist}(v, t) \mid u \in A(s), v \in A(t)\} = \mathrm{dist}(s, t)$.
Otherwise, it is a local query. To run an s-t query, first run
a quick locality filter that determines whether the query is local. This filter
is allowed to make one-sided errors; it can misclassify a global query as local,
but not the other way around. Locality filters are historically calculated using
the coordinates of the vertices. If it is a global query, calculate the minimum
$\mathrm{dist}(s, u) + \mathrm{dist}(u, v) + \mathrm{dist}(v, t)$ by trying all combinations of access nodes
from $A(s)$ and $A(t)$. Local queries are handled by a fast local search such as
CONTRACTION HIERARCHIES.
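
The query procedure just described can be sketched as below. This is a minimal illustration, not the implementation of Bast et al.: `access`, `table`, `is_local`, and `local_search` are hypothetical names for the precomputed access-node distances, the transit-node distance table, the locality filter, and the fallback local search.

```python
def tnr_query(s, t, access, table, is_local, local_search):
    """Answer an s-t distance query in the style of transit node routing.

    access[v] maps each access node of v to dist(v, access node);
    table[u][v] is the stored distance between transit nodes u and v.
    """
    if is_local(s, t):
        # The filter may send global queries here, never the reverse.
        return local_search(s, t)
    # Global query: try every pair of access nodes, |A(s)| * |A(t)| lookups.
    return min(
        ds + table[u][v] + dt
        for u, ds in access[s].items()
        for v, dt in access[t].items()
    )
```

The double loop makes the cost of a global query proportional to $|A(s)| \cdot |A(t)|$, which is why bounds on the size of the access node sets translate directly into bounds on the query time.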

Abraham et al. use a choice of $T$ based on multiscale shortest-path covers
to prove that access node sets are bounded in size by $O(h)$, from which it follows
that global queries can be handled in $O(h^2)$ time. Local queries done using
CONTRACTION HIERARCHIES can be handled in $O((h \log D)^2)$ time as we saw in
the previous section (however, local queries tend to be small, making the queries
run much faster than the average CONTRACTION HIERARCHIES query).

This bound is not possible without the new definition of highway dimension. 
Again, if we want polynomial time preprocessing, the query time bound for 
global queries increases to 0((Alog 

5.1 Lower bounding the query time 

While the upper bound for TRANSIT NODE ROUTING was for global queries only,
our lower bound will include both local and global searches. We will use
CONTRACTION HIERARCHIES for local queries.

Theorem 5. For all $h$, $D$, $n$, there is a graph $G = (V, E)$ with highway
dimension $h$, diameter $O(D)$, and $|V| > n$ such that for any choice of transit nodes $T$
and access nodes $A$, the average query time is $\Omega(h^2)$.

We work up to the proof of Theorem 5 using a series of definitions and
lemmas.

Call a leaf-leaf shortest path regular if the shortest path is global and neither 
endpoint is a transit node. We would like to exclude irregular shortest paths 
from our analysis. 

First, we show that queries with a transit node as an endpoint do not make 
up a constant fraction of all queries. Since |T| < the number of shortest 

paths in which at least one endpoint is a transit node is $O(|V|\sqrt{|V|}) \subseteq o(|V|^2)$.

Next, we consider the case in which local queries make up at least a 
fraction of total queries. In the previous section, we showed in Theorem that 
the average query for CONTRACTION HIERARCHIES requires $\Omega((h \log D)^2)$ time.
The proof showed a constant fraction of all queries required this amount of time. 
If we lower all of the constants in the proof, we can show that given any set of 

of total queries, a constant fraction of those queries require $\Omega((h \log D)^2)$
time (thus a constant fraction of all queries require that amount of time). This
big-Omega bound is higher than the one we seek to prove for global queries, so
for the rest of our analysis we can assume that a < ^ fraction of total queries 
are irregular. In particular, this means a constant fraction of the total queries 
are regular. 

There is a simple intuition for the rest of the proof. Given a regular shortest 
path either or must have an access node in the other’s copy, 

since the non-endpoint vertices on the shortest path all come from one copy. The 
proof becomes technical because we must show that a constant fraction of leaves 
have a large amount of access nodes in distinct copies and subtrees. But we are 
able to show that a constant fraction of the nodes need access nodes, and 

the proof follows. 

Lemma 7. If the number of local leaf-leaf queries is a < ^ fraction of total 
queries, then there is a set of | copies in which there are ^ leaves that are each 
an endpoint of ^ regular shortest paths going to at least | different copies. 

Proof. Assume the number of local leaf-leaf queries is o(|Gp), but assume the 
lemma is false. Then there must be > | copies with the following property: > G 

leaves are each endpoints of < ^ regular shortest paths going to > | copies. 

Now consider the maximum number of regular leaf-leaf shortest paths possi¬ 
ble in Gt^k.q under that assumption. Making all four inequalities tight, we have 
I copies with G leaves each as endpoints of ^ regular shortest paths going to 
I copies each. In other words, in half of the copies, half of the leaves each have 
the property that in half of the copies, half of the shortest paths going from that 
leaf to the copy are regular. This means that at the very least | | • 5 ■ | = 
of all leaf-leaf shortest paths must not be regular. 

This violates one of our assumptions, so we have a contradiction. □ 


Now we have the machinery necessary to prove Theorem 5.


Proof. We will show that $G_{t,k,q}$ has the desired properties, with the values of $t$,
$k$, and $q$ to be defined at the end of the proof.

From our previous argument at the start of this subsection, we need only 
consider the case where < of all queries are irregular. 

We use Lemma 7 to define a set $S$ of regular shortest paths such that there
are exactly | copies that have exactly G leaves with exactly ^ regular shortest 
paths in S going to | copies. 

Then 


1^1 > 


f'” q t^ 

y 2' T 


32 


(9) 


We added another factor of ^ because these shortest paths can be double counted. 

Given a path P G S, P's endpoints are two leaves and in differ¬ 
ent copies and must be of the form s^°'Gs^^'>-yjW g(a)_y;(o)_^(a)_^(b) i,y 
Lemma[^ Without loss of generality, assume that P is s^G-gib)-yj{b)_^{b)^ Since 
the path is global, must have an access node on P. The access node can’t 
be itself since P is regular. Therefore, the access node must be in 
This access node hits at most ^ paths in S stemming from because that
is the total number of shortest paths in S from to a leaf in Cfl- 

So given an arbitrary path in S, we have shown that an access node for some 
node must exist that can hit at most ^ other shortest paths in S. Then the 
total number of access nodes needed in S is at the very least 


^2^2fc ^ 

32 ^ Y 




( 10 ) 


As in earlier proofs, we let k = and q = h, and we pick t such that 

$qt^{k+1} > n$. Then $G$ has highway dimension $h$, diameter $O(D)$, and has $|V| > n$.

Queries in which both endpoints’ access node sets have size $\Omega(h)$ will take
$\Omega(h^2)$ time, and these make up a constant fraction of all global queries. □


6 Conclusions and Future Work 

We proved lower bounds on the query time of HUB LABELING, CONTRACTION
HIERARCHIES, and TRANSIT NODE ROUTING. The proofs are all quite different,
despite using the same family of graphs for each proof. We also generalized a
lower bound on the size of $E^+$ in CONTRACTION HIERARCHIES preprocessing,
and established hardness for optimal preprocessing in HUB LABELING.

Although we have proven lower bounds for the query times of three state- 
of-the-art algorithms, the graphs used in the arguments are not representative 
of real-world graphs. For instance, the graphs do not have small separators and 
are not planar. This implies it may be possible to circumvent this lower bound 
using different properties that better capture the structure of real-world graphs. 

Another way to work with more realistic road networks is to use the idea 
of multiscale dispersed graphs, defined in [5], as a new model for graphs that 
simulate real-world graphs. One may be able to obtain better bounds on the 
query time with this model. 

Throughout this paper, we assumed undirected graphs, so future work could
extend these results to the directed case. Furthermore, apart from HUB LABELING,
the upper and lower bounds are not tight because of the different definitions
of highway dimension. Ideally, we would find a way to prove the lower bounds
using the more recent definition of highway dimension. However, we cannot use
$G_{t,k,q}$ for this task. Under the new definition, $G_{t,k,q}$ has highway dimension at
least $q + k$, since the new definition guarantees a graph’s degree is bounded by
its highway dimension.